Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions

Authors

  • Baoli Li
  • Carl Vogel
Abstract

Error-Correcting Output Coding (ECOC) is a general framework for multiclass text classification with a set of binary classifiers. It can not only help a binary classifier solve multi-class classification problems, but also boost the performance of a multi-class classifier. When building each individual binary classifier in ECOC, multiple classes are randomly grouped into two disjoint groups: positive and negative. However, when training such a binary classifier, sub-class distribution within positive and negative classes is neglected. Utilizing this information is expected to improve a binary classifier. We thus design a simple binary classification strategy via multi-class categorization (2vM) to make use of sub-class partition information, which can lead to better performance over the traditional binary classification. The proposed binary classification strategy is then applied to enhance ECOC. Experiments on document categorization and question classification show its effectiveness.
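The 2vM idea described above can be sketched for a single binary problem as follows. This is only a minimal illustration, not the authors' implementation: the nearest-centroid learner, the function names (`train_direct`, `train_2vm`), and the toy data are all assumptions made for the example. The traditional route collapses the sub-classes into a positive and a negative group before training; the 2vM-style route trains on the original sub-class labels and maps each predicted sub-class back to its group.

```python
def train_centroids(X, y):
    """Nearest-centroid learner (assumed stand-in): one mean vector per label in y."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def nearest(centroids, x):
    """Label of the centroid closest to x (squared Euclidean distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))


def train_direct(X, y, positive):
    """Traditional binary training: collapse sub-classes into +1 / -1 first."""
    groups = [1 if label in positive else -1 for label in y]
    model = train_centroids(X, groups)
    return lambda x: nearest(model, x)


def train_2vm(X, y, positive):
    """2vM-style training: learn the sub-classes, then map the prediction to a group."""
    model = train_centroids(X, y)
    return lambda x: 1 if nearest(model, x) in positive else -1


# Made-up 1-D data: the positive group {"A", "B"} lies on both sides of the
# negative class "C", so the group-level centroids nearly coincide.
X = [[0.0], [1.0], [9.0], [10.0], [4.0], [7.0]]
y = ["A", "A", "B", "B", "C", "C"]
positive = {"A", "B"}

direct = train_direct(X, y, positive)
via_multiclass = train_2vm(X, y, positive)

test_points = [[0.5], [9.5], [5.2]]   # true groups: +1, +1, -1
direct_out = [direct(x) for x in test_points]
twovm_out = [via_multiclass(x) for x in test_points]
print("direct:", direct_out)
print("2vM:   ", twovm_out)
```

On this toy data the 2vM-style classifier assigns all three test points to the correct group, while the direct binary classifier mislabels two of them because pooling "A" and "B" hides their separate clusters. In an ECOC setting, a strategy of this kind would stand in for each binary classifier attached to one random grouping of the classes.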


Similar resources

Optimizing Linear Discriminant Error Correcting Output Codes Using Particle Swarm Optimization

Error Correcting Output Codes reveal an efficient strategy in dealing with multi-class classification problems. According to this technique, a multi-class problem is decomposed into several binary ones. On these created sub-problems we apply binary classifiers and then, by combining the acquired solutions, we are able to solve the initial multiclass problem. In this paper we consider the optimi...


Improvement of Performance in Multiclass Problems by Using Biclassification Based on Error-Correcting Output Code

Error-correcting output coding (ECOC) is a widely used multicategory classification algorithm that decomposes multiclass problems into a set of binary classification problems. In this paper, we propose a new method based on a bi-classification strategy, consisting of one-vs-one and ECOC classification. Also we introduce methods to improve a standard ECOC. The proposed method is compared to othe...


Loss-Weighted Decoding for Error-Correcting Output Coding

The multi-class classification is a challenging problem for several applications in Computer Vision. Error Correcting Output Codes technique (ECOC) represents a general framework capable to extend any binary classification process to the multi-class case. In this work, we present a novel decoding strategy that takes advantage of the ECOC coding to outperform the up to now existing decoding stra...


Solving Multiclass Learning Problems via Error-Correcting Output Codes

Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C...


Solving Multiclass Learning Problems via Error-Correcting Output

Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorit...




Publication date: 2010